REMARKS 

Claims 18-27, 29-33, 35-57, 65, and 69 are pending in the
Application and are rejected under 35 U.S.C. §103(a) based upon the
references of Matsuura, et al., U.S. Patent No. 7,136,684, and Varma, et
al., U.S. Patent Application Publication No. US2004/0213419.

Rejections Under 35 U.S.C. §103(a)

The pending claims are rejected as being rendered obvious by 
the combination of Matsuura, et al. in view of Varma, et al. Matsuura, et
al. was an earlier cited reference, and the claims were amended during
prosecution to further define the invention, and arguments were 
presented pointing out how the Matsuura, et al. reference does not teach 
the features of the invention. 

In the current Office Action, the Examiner argues that it is inherent 
within the Matsuura, et al. reference that the sampled representations 
processed by the headset would include user speech as opposed to 
extraneous noise. However, such a feature is not inherent because the 
headset would process both user speech and extraneous noise together. 
It does not care about content. Matsuura, et al. transmits the audio 
signal when the user is speaking, but does not make any determination 
with respect to the content of that audio signal, and whether that audio 
signal has speech words in its content, or is just noise.

Specifically, the Matsuura, et al. reference discloses a headset
and recording system for recording the utterances of a wearer, such as a 
doctor, for example. The system captures the utterances of a user 
wearing a headset and transmits them, such as for two-way radio 
communication, and then records those utterances in a memory unit. 
Matsuura, et al. does not care about the content of the utterance or 
whether it is speech. It could be simple mumbling or laughter by the 
user, and the Matsuura, et al. headset would still transmit the captured 
audio. Whatever sound is captured by the Matsuura, et al. headset and 
the microphone is transmitted automatically, and is recorded for future 
processing and use. The recorded audio may then be replayed and 
reviewed. 

All noises and audio signals captured by the Matsuura, et al.
microphone are transmitted to be recorded. As noted in Column 12,
Lines 1-5, and again in Column 16, Lines 1-10, communication contents,
such as those spoken into a headset, are automatically stored in the
memory unit, and thus, are automatically transmitted from the headset to 
the memory unit. That is, as noted in Column 16, while recordings of 
audio captured by the headset may be selectively turned on and off, the 
present invention presupposes that, when the headset is worn and the 
communications are started, the communication contents, such as
conversation contents, are automatically recorded. In any case, the
audio signals are always transmitted by the headset, regardless of their 
speech content. 

Therefore, all the sounds captured by the Matsuura, et al. headset 
are generally transmitted and may or may not be recorded. Whatever 
audio is captured, that audio will be processed and sent utilizing a radio 
communication module. But Matsuura, et al. teaches no way to know 
the actual content of the captured audio and whether it is speech or 
unintelligible noise. 

The Examiner refers to the Varma, et al. reference in order to
somehow teach that the sampled representation includes user speech
as opposed to extraneous noise. However, Varma, et al. is directed to a
noise reduction system for voice applications directed to gaming
consoles. The combination of Matsuura, et al. and Varma, et al.
certainly would not provide the teaching to a person of ordinary skill in
the art to render the present invention obvious.

The claims of the invention have been further
amended to clarify their scope. Specifically, although Claims 18-28 are
cancelled, Claim 29 has been amended to further recite a system for 
wireless communications using speech recognition. In speech 
recognition, as discussed in the current Application, various steps of 
processing take place in order to determine the actual speech content of 
an audio signal that may contain user speech or spoken words. The
content, for example, would be specific words that are spoken and then
recognized by the speech recognition for various purposes. To that end, 
there is some initial processing of the audio signals in the headset,
referred to as front-end speech recognition, that occurs prior to the
additional speech recognition processing that might occur in the other 
device. The additional processing might include, for example, further 
code book lookup steps and/or pattern matching in order to determine 
what actual words are spoken. This later processing is referred to in the
present Application as back-end speech recognition processing.

The present invention provides a system for wireless 
communication using speech recognition. Referring to Claim 29, that 
system includes a device that is configured for processing speech 
signals using speech recognition circuitry wherein the device includes at 
least some back-end speech recognition processing circuitry. As 
claimed, the system of Claim 29 further comprises a headset configured 
for performing front-end speech recognition processing by initially 
forming sampled spectral transforms of the captured audio signal, and 
processing those sampled spectral transforms using speech detection 
circuitry to determine that the captured audio signals include user 
speech as opposed to extraneous noise. 

Therefore, the headset recited in Claim 29 performs a specific 
speech recognition process in order to determine whether there are 
actual spoken words or speech within the audio signals captured by the
microphone in the headset such that they should be further processed
using back-end speech recognition. Furthermore, the invention 
processes the captured audio into a form (sampled spectral transforms) 
that then might be utilized for more efficient back-end speech 
processing. If the headset determines, through the front-end speech 
recognition processing, that the content of the captured audio signals
includes user speech as opposed to some extraneous noise, then it will
transmit the signals for the back-end speech processing. If the headset 
determines, by the front-end speech recognition processing, that the 
captured audio is simply noise, rather than actual user speech words, 
the headset does not allow for the transmission of the sampled spectral 
transforms to the device for further processing. If the front-end speech 
recognition processing of the sampled spectral transforms does 
determine that the captured audio signals include user speech as 
opposed to the extraneous noise, the headset has switching circuitry that 
facilitates wirelessly transmitting the sampled spectral transforms to the
device for use of those spectral transforms to complete the speech 
recognition of the system. 

Again, as noted above, the elements called forth by the Examiner 
in Matsuura, et al. simply refer to a microphone and an element called a 
speech detection unit 36 and encoding hardware 38. In the Matsuura, et
al. reference, the speech detection unit 36 is only disclosed as
converting the speech signal generated by the microphone into a digital
signal. It does not provide any front-end speech recognition processing,
and does not form sampled spectral transforms of the captured audio 
signals in order to process those sampled spectral transforms to 
determine that the audio signals include user speech as opposed to 
extraneous noise. The speech detection unit 36 of Matsuura, et al.
merely converts the captured audio signals into digital signals. Those 
digital signals are transmitted unconditionally when spoken, and there is 
no analysis or processing of those audio signals to form sampled 
spectral transforms. Nor is there any analysis or processing of those 
sampled spectral transforms. Rather, everything captured by 
microphone 17 is transmitted by circuit 36. Such indiscriminate 
transmission utterly defeats one of the purposes of the present invention, 
which is to selectively forward signals only when they include spoken 
words, as determined by the speech recognition processing. 

The Matsuura, et al. reference is primarily directed to recording 
audio signals. As earlier pointed out, in Column 8, Lines 1-10 of the 
Matsuura, et al. reference, the system presupposes the basic constant 
recording where the recording is made automatically while the user 
wears the headset and carries out communications. There is simply no 
teaching of the selective features of the invention, including a headset 
with switching circuitry, that is operable to facilitate selectively wirelessly 
transmitting the sampled spectral transforms of the captured audio
signals when user speech is detected from those spectral transforms,
but not allowing transmitting to the device when user speech is not 
detected from those spectral transforms. 

Nor does the Varma, et al. reference make up for the lack of 
teaching in Matsuura, et al. such that a combination of those two 
references would render obvious the present invention. Varma, et al. is
referred to by the Examiner for teaching a device and the use of a 
sampled representation, which includes user speech as opposed to 
extraneous noise. Essentially, the Varma, et al. reference is directed to 
a noise reduction system for a gaming device. However, the present 
invention is not directed to noise reduction. In fact, if the captured audio 
in the headset of the invention is analyzed using the front-end speech 
recognition processing and found to contain speech or spoken words, 
that audio would be transmitted even if it did contain noise, because it 
also contains the desired speech. The invention is looking for 
recognized speech, even if that speech has noise in it. It just does not 
transmit noise alone without speech. Matsuura, et al. will always
transmit when there is a sound, regardless of whether that sound is spoken words
or not. The Varma, et al. reference provides no teaching with respect to 
speech recognition. Rather, it teaches a noise reduction system that has 
an array of microphones that are used to recognize noise from known 
locations and to filter that noise from the known locations so that any 
voice communication is free from the undesirable noise.

It is unlikely that a person of ordinary skill in the art would even
refer to the Varma, et al. reference to somehow modify the Matsuura, et 
al. reference. Even if those two references were combined, all that 
would be achieved is to provide some noise reduction in the headset of 
Matsuura, et al. Such a combination still would not teach a system as 
recited in Claim 29 that provides wireless communications using speech 
recognition, wherein a device is configured for using speech recognition 
circuitry that includes at least some back-end speech recognition 
processing circuitry, and a headset captures audio signals and performs 
front-end speech recognition processing by initially forming sampled 
spectral transforms and processing those sampled spectral transforms to 
determine if the captured audio signal includes user speech. Nor is 
there a teaching of a headset that includes switching circuitry to 
selectively transmit the sampled spectral transforms for further completion
of the speech recognition by back-end speech recognition processing 
circuitry, only when the captured audio signals include user speech, as 
determined by the front-end recognition processing of the spectral 
transforms. 

Therefore, a combination of Matsuura, et al. and Varma, et al. 
does not teach all the limitations recited in Claim 29. Thus, that claim 
cannot be rendered obvious under 35 U.S.C. §103(a). Accordingly,
Claim 29 is allowable. Dependent Claims 31-33, 39-41, and 43-44 each
depend from Claim 29 and thus, would be allowable for similar reasons.
Furthermore, each of those claims recites a unique combination of 
elements, which is not taught by the cited art. 

Claim 45 is directed to a method for wireless communication 
between a headset and a device using speech recognition. Claim 45 
further recites the step of processing the captured audio signals, and 
performing front-end speech recognition by forming sampled spectral 
transforms of the captured audio signals. Claim 45 further recites using 
speech detection circuitry to analyze the sampled spectral transforms to 
determine that the captured audio signals include user speech as 
opposed to extraneous noise. Furthermore, Claim 45 recites using 
switch circuitry for selectively wirelessly transmitting sampled spectral 
transforms of the captured audio signals when user speech, rather than 
noise, is detected from those spectral transforms, but not transmitting 
when user speech is not detected. Claim 45 also recites that the device
uses back-end speech recognition processing circuitry to process the
spectral transforms that are transmitted by the headset to complete the 
speech recognition. For the reasons discussed hereinabove, the 
method as recited in Claim 45 also would not be taught by the two prior 
art references of Matsuura, et al. and Varma, et al. such that those 
references would render obvious Claim 45. Accordingly, Claim 45 is 
allowable. Claims 48-49, 51-52, and 54-57 each depend from Claim 45,
and thus, would be allowable as well. Furthermore, each of those claims
recites a unique method or process, including a combination of steps, 
which is not taught by the cited prior art. 

Claim 58 has been amended and recites a headset for
communication with a remote device for use in speech recognition. The 
headset is recited as comprising front-end speech recognition circuitry 
that forms sampled spectral transforms of the captured audio signals in 
order to reduce the amount of microphone system output data that is 
communicated. Claim 58 further recites switching circuitry coupled with 
the front-end speech recognition circuitry and configured to facilitate 
selectively transmitting the sampled spectral transforms when user 
speech is detected, and not transmitting when user speech is not 
detected. Claim 58 further recites that the sampled spectral transforms 
are in a form usable by back-end speech recognition processing to 
complete the speech recognition. For the reasons discussed 
hereinabove with respect to Claim 29, Claim 58 is also in an allowable 
form, and is not rendered obvious by the cited combination of prior art 
references. Each of Claims 65 and 69 depends from Claim 58, and would
be allowable for the same reason. Furthermore, those claims recite a 
unique combination of elements, which are not rendered obvious by the 
prior art.

In light of the foregoing, it is respectfully submitted that the
present application is in a condition for allowance and notice to that 
effect is hereby requested. If it is found that the present amendment 
does not place the application in a condition for allowance, Applicant's 
undersigned attorney requests that the examiner initiate a telephone 
interview to expedite prosecution of the application. 

Applicants are submitting the fee due for the two-month extension 
of time with this response. If any additional fees are necessary, the 
Commissioner may consider this to be a request for such and charge 
any necessary fees to deposit account 23-3000.